Video decoder for tiles with absolute signaling

ABSTRACT

A system for decoding a video bitstream includes receiving a reference picture set associated with a frame including a set of reference picture identifiers. The reference picture set identifies one or more reference pictures to be used for inter-prediction of the frame based upon its associated least significant bits of a picture order count based upon the reference picture identifiers. The one or more reference pictures is a second or greater previous frame to the frame having the matching reference picture identifier.

CROSS-REFERENCE TO RELATED APPLICATIONS

None.

BACKGROUND OF THE INVENTION

The present invention relates to video encoding and/or decoding.

Digital video is typically represented as a series of images or frames, each of which contains an array of pixels. Each pixel includes information, such as intensity and/or color information. In many cases, each pixel is represented as a set of three colors, each of which may be defined by eight-bit color values.

Video-coding techniques, for example H.264/MPEG-4 AVC (H.264/AVC), typically provide higher coding efficiency at the expense of increasing complexity. Increasing image quality requirements and increasing image resolution requirements for video coding techniques also increase the coding complexity. Video decoders that are suitable for parallel decoding may improve the speed of the decoding process and reduce memory requirements; video encoders that are suitable for parallel encoding may improve the speed of the encoding process and reduce memory requirements.

H.264/MPEG-4 AVC [Joint Video Team of ITU-T VCEG and ISO/IEC MPEG, “H.264: Advanced video coding for generic audiovisual services,” ITU-T Rec. H.264 and ISO/IEC 14496-10 (MPEG-4 Part 10), November 2007], and similarly the JCT-VC, [“Draft Test Model Under Consideration”, JCTVC-A205, JCT-VC Meeting, Dresden, April 2010 (JCT-VC)], both of which are incorporated by reference herein in their entirety, are video codec (encoder/decoder) specifications that decode pictures based upon reference pictures in a video sequence for compression efficiency.

The foregoing and other objectives, features, and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

FIG. 1 illustrates an H.264/AVC video encoder.

FIG. 2 illustrates an H.264/AVC video decoder.

FIG. 3 illustrates an exemplary slice structure.

FIG. 4 illustrates another exemplary slice structure.

FIG. 5 illustrates reconstruction of an entropy slice.

FIG. 6 illustrates reconstruction of a portion of the entropy slice of FIG. 5.

FIG. 7 illustrates reconstruction of an entropy slice with an omitted LSB count value.

FIG. 8 illustrates reconstruction of an entropy slice with a long term picture value.

FIG. 9 illustrates reconstruction of an entropy slice by selecting a first preceding frame with a long term picture value.

FIG. 10 illustrates reconstruction of an entropy slice by using a duplicate long term picture frame having the same least significant bit count value.

FIGS. 11A-11B illustrate a technique for selecting a reference frame.

FIG. 12 illustrates another technique for selecting a reference frame.

FIGS. 13A-13B illustrate another technique for selecting a reference frame.

FIG. 14 illustrates another technique for selecting a reference frame.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENT

While any video coder/decoder (codec) that uses encoding/decoding may be accommodated by embodiments described herein, exemplary embodiments are described in relation to an H.264/AVC encoder and an H.264/AVC decoder merely for purposes of illustration. Many video coding techniques are based on a block-based hybrid video-coding approach, wherein the source-coding technique is a hybrid of inter-picture (also considered inter-frame) prediction, intra-picture (also considered intra-frame) prediction, and transform coding of a prediction residual. Inter-frame prediction may exploit temporal redundancies, and intra-frame prediction and transform coding of the prediction residual may exploit spatial redundancies.

FIG. 1 is a block diagram illustrating an exemplary encoder 104 for an electronic device 102. It should be noted that one or more of the elements illustrated as included within the electronic device 102 may be implemented in hardware and/or software. For example, the electronic device 102 includes an encoder 104, which may be implemented in hardware and/or software.

The electronic device 102 may include a supplier 134. The supplier 134 may provide picture or image data (e.g., video) as a source 106 to the encoder 104. Non-limiting examples of the supplier 134 include image sensors, memory, communication interfaces, network interfaces, wireless receivers, ports, video frame content, previously encoded video content, non-encoded video content, etc.

The source 106 may be provided to an intra-frame prediction module and reconstruction buffer 140. The source 106 may also be provided to a motion estimation and motion compensation module 166 and to a subtraction module 146.

The intra-frame prediction module and reconstruction buffer 140 may generate intra mode information 158 and an intra signal 142 based on the source 106 and reconstructed data 180. The motion estimation and motion compensation module 166 may generate inter mode information 168 and an inter signal 144 based on the source 106 and a reference picture buffer 196 signal 198.

The reference picture buffer 196 signal 198 may include data from one or more reference pictures stored in the reference picture buffer 196. The reference picture buffer 196 may also include an RPS index initializer module 108. The initializer module 108 may process reference pictures corresponding to the buffering and list construction of an RPS.

The encoder 104 may select between the intra signal 142 and the inter signal 144 in accordance with a mode. The intra signal 142 may be used in order to exploit spatial characteristics within a picture in an intra coding mode. The inter signal 144 may be used in order to exploit temporal characteristics between pictures in an inter coding mode. While in the intra coding mode, the intra signal 142 may be provided to the subtraction module 146 and the intra mode information 158 may be provided to an entropy coding module 160. While in the inter coding mode, the inter signal 144 may be provided to the subtraction module 146 and the inter mode information 168 may be provided to the entropy coding module 160.

Either the intra signal 142 or the inter signal 144 (depending on the mode) is subtracted from the source 106 at the subtraction module 146 in order to produce a prediction residual 148. The prediction residual 148 is provided to a transformation module 150. The transformation module 150 may compress the prediction residual 148 to produce a transformed signal 152 that is provided to a quantization module 154. The quantization module 154 quantizes the transformed signal 152 to produce transformed and quantized coefficients (TQCs) 156.

The TQCs 156 are provided to an entropy coding module 160 and an inverse quantization module 170. The inverse quantization module 170 performs inverse quantization on the TQCs 156 to produce an inverse quantized signal 172 that is provided to an inverse transformation module 174. The inverse transformation module 174 decompresses the inverse quantized signal 172 to produce a decompressed signal 176 that is provided to a reconstruction module 178.

The reconstruction module 178 may produce reconstructed data 180 based on the decompressed signal 176. For example, the reconstruction module 178 may reconstruct (modified) pictures. The reconstructed data 180 may be provided to a deblocking filter 182 and to the intra prediction module and reconstruction buffer 140. The deblocking filter 182 may produce a filtered signal 184 based on the reconstructed data 180.

The filtered signal 184 may be provided to a sample adaptive offset (SAO) module 186. The SAO module 186 may produce SAO information 188 that is provided to the entropy coding module 160 and an SAO signal 190 that is provided to an adaptive loop filter (ALF) 192. The ALF 192 produces an ALF signal 194 that is provided to the reference picture buffer 196. The ALF signal 194 may include data from one or more pictures that may be used as reference pictures.

The entropy coding module 160 may code the TQCs 156 to produce a bitstream 114. Also, the entropy coding module 160 may code the TQCs 156 using Context-Adaptive Variable Length Coding (CAVLC) or Context-Adaptive Binary Arithmetic Coding (CABAC). In particular, the entropy coding module 160 may code the TQCs 156 based on one or more of intra mode information 158, inter mode information 168 and SAO information 188. The bitstream 114 may include coded picture data. The encoder often encodes a frame as a sequence of blocks, generally referred to as macroblocks.

Quantization, involved in video compression such as HEVC, is a lossy compression technique achieved by compressing a range of values to a single value. The quantization parameter (QP) is a predefined scaling parameter used to perform the quantization based on both the quality of reconstructed video and compression ratio. The block type is defined in HEVC to represent the characteristics of a given block based on the block size and its color information. QP, resolution information and block type may be determined before entropy coding. For example, the electronic device 102 (e.g., the encoder 104) may determine the QP, resolution information and block type, which may be provided to the entropy coding module 160.
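As a minimal sketch of how quantization maps a range of values to a single value, the following Python snippet uses a uniform quantizer. The function names and the step size are hypothetical and chosen only for illustration; the snippet does not reproduce the mapping from QP to step size used by HEVC or any other codec.

```python
def quantize(value: int, step: int) -> int:
    """Map a non-negative value to a quantization level (illustrative only)."""
    return value // step


def dequantize(level: int, step: int) -> int:
    """Reconstruct one representative value for a level; the remainder is lost."""
    return level * step


STEP = 8  # hypothetical step size, not derived from any real QP table

# Every value in the range 32..39 collapses to the same level.
levels = {quantize(v, STEP) for v in range(32, 40)}
reconstructed = dequantize(quantize(37, STEP), STEP)
```

In this sketch the eight inputs 32 through 39 share one level, and the decoder recovers only the representative value for that level, which is why the technique is lossy.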

The entropy coding module 160 may determine the block size based on a block of TQCs 156. For example, block size may be the number of TQCs 156 along one dimension of the block of TQCs. In other words, the number of TQCs 156 in the block of TQCs may be equal to block size squared. For instance, block size may be determined as the square root of the number of TQCs 156 in the block of TQCs. Resolution may be defined as a pixel width by a pixel height. Resolution information may include a number of pixels for the width of a picture, for the height of a picture or both. Block size may be defined as the number of TQCs 156 along one dimension of a 2D block of TQCs.
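The relationship between the number of TQCs and the block size can be sketched as follows. The function name is an assumption made for illustration, and the sketch assumes a square block, as the description of block size squared implies.

```python
import math


def block_size_from_tqc_count(num_tqcs: int) -> int:
    """Return the side length of a square block holding num_tqcs coefficients."""
    size = math.isqrt(num_tqcs)
    if size * size != num_tqcs:
        # The text describes square blocks only; anything else is rejected here.
        raise ValueError("number of TQCs is not a perfect square")
    return size
```

For example, a block of 64 TQCs would have a block size of 8 under this definition.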

In some configurations, the bitstream 114 may be transmitted to another electronic device. For example, the bitstream 114 may be provided to a communication interface, network interface, wireless transmitter, port, etc. For instance, the bitstream 114 may be transmitted to another electronic device via LAN, the Internet, a cellular phone base station, etc. The bitstream 114 may additionally or alternatively be stored in memory on the electronic device 102 or other electronic device.

FIG. 2 is a block diagram illustrating an exemplary decoder 212 on an electronic device 202. The decoder 212 may be included for an electronic device 202. For example, the decoder 212 may be an HEVC decoder. The decoder 212 and/or one or more of the elements illustrated as included in the decoder 212 may be implemented in hardware and/or software. The decoder 212 may receive a bitstream 214 (e.g., one or more encoded pictures included in the bitstream 214) for decoding. In some configurations, the received bitstream 214 may include received overhead information, such as a received slice header, received PPS (or picture parameter set), received buffer description information, etc. The encoded pictures included in the bitstream 214 may include one or more encoded reference pictures and/or one or more other encoded pictures.

Received symbols (in the one or more encoded pictures included in the bitstream 214) may be entropy decoded by an entropy decoding module 268, thereby producing a motion information signal 270 and quantized, scaled and/or transformed coefficients 272.

The motion information signal 270 may be combined with a portion of a reference frame signal 298 from a frame memory 278 at a motion compensation module 274, which may produce an inter-frame prediction signal 282. The quantized, descaled and/or transformed coefficients 272 may be inverse quantized, scaled and inverse transformed by an inverse module 262, thereby producing a decoded residual signal 284. The decoded residual signal 284 may be added to a prediction signal 292 to produce a combined signal 286. The prediction signal 292 may be a signal selected from either the inter-frame prediction signal 282 or an intra-frame prediction signal 290 produced by an intra-frame prediction module 288. In some configurations, this signal selection may be based on (e.g., controlled by) the bitstream 214.

The intra-frame prediction signal 290 may be predicted from previously decoded information from the combined signal 286 (in the current frame, for example). The combined signal 286 may also be filtered by a de-blocking filter 294. The resulting filtered signal 296 may be written to frame memory 278. The resulting filtered signal 296 may include a decoded picture.

The frame memory 278 may include a DPB (or display picture buffer) as described herein. The DPB may include one or more decoded pictures that may be maintained as short or long term reference frames. The frame memory 278 may also include overhead information corresponding to the decoded pictures. For example, the frame memory 278 may include slice headers, PPS information, buffer description information, etc. One or more of these pieces of information may be signaled from an encoder (e.g., encoder 104). The frame memory 278 may provide a decoded picture 718.

An input picture comprising a plurality of macroblocks may be partitioned into one or several slices. The values of the samples in the area of the picture that a slice represents may be properly decoded without the use of data from other slices provided that the reference pictures used at the encoder and the decoder are the same and that de-blocking filtering does not use information across slice boundaries. Therefore, entropy decoding and macroblock reconstruction for a slice do not depend on other slices. In particular, the entropy coding state may be reset at the start of each slice. The data in other slices may be marked as unavailable when defining neighborhood availability for both entropy decoding and reconstruction. The slices may be entropy decoded and reconstructed in parallel. Preferably, neither intra prediction nor motion-vector prediction is allowed across the boundary of a slice. In contrast, de-blocking filtering may use information across slice boundaries.

FIG. 3 illustrates an exemplary video picture 90 comprising eleven macroblocks in the horizontal direction and nine macroblocks in the vertical direction (nine exemplary macroblocks labeled 91-99). FIG. 3 illustrates three exemplary slices: a first slice denoted “SLICE #0” 89, a second slice denoted “SLICE #1” 88 and a third slice denoted “SLICE #2” 87. An H.264/AVC decoder may decode and reconstruct the three slices 87, 88, 89 in parallel. Each of the slices may be transmitted in scan line order in a sequential manner. At the beginning of the decoding/reconstruction process for each slice, entropy decoding 268 is initialized or reset and macroblocks in other slices are marked as unavailable for both entropy decoding and macroblock reconstruction. Thus, for a macroblock, for example, the macroblock labeled 93, in “SLICE #1,” macroblocks (for example, macroblocks labeled 91 and 92) in “SLICE #0” may not be used for entropy decoding or reconstruction. Whereas, for a macroblock, for example, the macroblock labeled 95, in “SLICE #1,” other macroblocks (for example, macroblocks labeled 93 and 94) in “SLICE #1” may be used for entropy decoding or reconstruction. Therefore, entropy decoding and macroblock reconstruction proceed serially within a slice. Unless slices are defined using a flexible macroblock ordering (FMO), macroblocks within a slice are processed in the order of a raster scan.

Flexible macroblock ordering defines a slice group to modify how a picture is partitioned into slices. The macroblocks in a slice group are defined by a macroblock-to-slice-group map, which is signaled by the content of the picture parameter set and additional information in the slice headers. The macroblock-to-slice-group map consists of a slice-group identification number for each macroblock in the picture. The slice-group identification number specifies to which slice group the associated macroblock belongs. Each slice group may be partitioned into one or more slices, wherein a slice is a sequence of macroblocks within the same slice group that is processed in the order of a raster scan within the set of macroblocks of a particular slice group. Entropy decoding and macroblock reconstruction proceed serially within a slice group.

FIG. 4 depicts an exemplary macroblock allocation into three slice groups: a first slice group denoted “SLICE GROUP #0” 86, a second slice group denoted “SLICE GROUP #1” 85 and a third slice group denoted “SLICE GROUP #2” 84. These slice groups 84, 85, 86 may be associated with two foreground regions and a background region, respectively, in the picture 90.

A picture may be partitioned into one or more slices, wherein a slice may be self-contained in the respect that values of the samples in the area of the picture that the slice represents may be correctly reconstructed without use of data from other slices, provided that the reference pictures used are identical at the encoder and the decoder. All reconstructed macroblocks within a slice may be available in the neighborhood definition for reconstruction.

A slice may be partitioned into more than one entropy slice, wherein an entropy slice may be self-contained in the respect that the area of the picture that the entropy slice represents may be correctly entropy decoded without the use of data from other entropy slices. The entropy decoding 268 may be reset at the decoding start of each entropy slice. The data in other entropy slices may be marked as unavailable when defining neighborhood availability for entropy decoding.

A device configured for decoding pictures obtains or otherwise receives a bitstream that includes a series of pictures, including a current picture. The device further obtains a reference picture set (RPS) parameter that may be used for the identification of other frames that may be used for the decoding of the current picture or for the decoding of pictures subsequent to the current picture in the order that pictures are signaled in the bitstream.

An RPS provides an identification of a set of reference pictures associated with the current frame. An RPS may identify reference pictures that are prior to the current picture in display order that may be used for inter prediction of the current picture and/or identify reference pictures that are after the current picture in display order that may be used for inter prediction of the current picture. For example, suppose the system receives frames 1, 3 and 5, frame 5 uses frame 3 for reference, and the encoder uses frame 1 for the prediction of frame 7. Then the RPS for frame 5 may signal to keep both frame 3 and frame 1 in the frame memory 278, even though frame 1 is not used for reference by frame 5. In one embodiment, the RPS for frame 5 may be [−2 −4]. Additionally, the frame memory 278 may be referred to as the display picture buffer, or equivalently DPB. For this example, the frame number corresponds to the display order, or output order, of the frames.

An RPS describes one or more reference pictures that should be maintained, at least for a limited time duration, in the decoded picture buffer (DPB) for subsequent use. This identification of the RPS may be included in the slice header of each picture, together with a picture, and/or together with a group of pictures. In one embodiment, a list of RPS may be sent in a picture parameter set (PPS). Then, the slice header may identify one of the RPS sent in the PPS to be used for the slice. For example, an RPS for a group of pictures may be signaled in a picture parameter set (PPS). Any pictures in the DPB that are not a part of the RPS for the current frame may be marked as “unused for reference.”
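The marking step can be sketched as follows, using the frame 5 example with an RPS of [−2 −4]. The function name and the list-based model of the DPB are assumptions for illustration, and frame 0 is a hypothetical extra picture added to the DPB so that there is something to mark.

```python
def unused_for_reference(current_poc, rps_deltas, dpb_pocs):
    """Return the POCs in the DPB that are not named by the current RPS."""
    keep = {current_poc + delta for delta in rps_deltas}
    return [poc for poc in dpb_pocs if poc not in keep]


# Frame 5 with RPS [-2, -4] keeps frames 3 and 1.
# Frame 0 is a hypothetical additional picture held in the DPB.
dropped = unused_for_reference(current_poc=5, rps_deltas=[-2, -4], dpb_pocs=[0, 1, 3])
```

Here frame 1 stays in the DPB even though frame 5 does not predict from it, because the RPS still lists it for later use by frame 7.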

A DPB may be used to store reconstructed (e.g., decoded) pictures at the decoder. These stored pictures may then be used, for example, in an inter-prediction technique. Also, a picture in the DPB may be associated with a picture order count (POC). The POC may be a variable that is associated with each encoded picture and that has a value that increases with increasing picture position in an output order. In other words, the POC may be used by the decoder to deliver the pictures in the correct order for display. The POC may also be used for identification of reference pictures during construction of a reference picture list and identification of decoded reference pictures. Furthermore, the POC may be used for identification of pictures that are lost during transmission from an encoder to a decoder.

Referring to FIG. 5, one example of a set of frames 300 provided from an encoder to a decoder is illustrated. Each of the frames may have an associated POC 310. As illustrated, the POC may increment from a negative number through a large positive number. In some embodiments, the POC may only increment from zero through a larger positive number. The POC is typically incremented by one for each frame, but in some cases one or more POC are skipped or otherwise omitted. For example, the POC for a set of frames in the encoder may be 0, 1, 2, 3, 4, 5, etc. For example, the POC for the same or another set of frames in the encoder may be 0, 1, 2, 4, 5, etc., with POC 3 being skipped or otherwise omitted.

As the POC becomes sufficiently large, a significant number of bits would be necessary to identify each frame using the POC. The encoder may reduce the number of bits used to identify a particular POC by using a selected number of least significant bits (LSB) of the POC to identify each frame, such as 4 bits. Since the reference frames used for decoding the current frame are often temporally located proximate to the current frame, this identification technique is suitable and results in a reduction in the computational complexity of the system and an overall reduction in the bit rate of the video. The number of LSB to use to identify the pictures may be signaled in the bit stream to the decoder.

As illustrated, when the selected number of LSB of the POC is 4 bits, the LSB index repeats every 16 values (2^4). Thus, frame 0 has a LSB having a value of 0, frame 1 has a LSB having a value of 1, ..., frame 14 has a LSB having a value of 14, and frame 15 has a LSB having a value of 15. However, frame 16 again has a LSB having a value of 0, frame 17 again has a LSB having a value of 1, and frame 20 has a LSB having a value of 4. The LSB identifier (generally also referred to as the LSB of the POC or, equivalently, POC LSB) may have the characteristic of LSB=POC % 16, where % is the remainder after dividing by 16 (2^number of least significant bits, which in this case is 4). Similarly, if the selected number of LSBs to identify a POC is N bits, the LSB identifier may have the characteristic of LSB=POC % (2^N), where 2^N denotes 2 raised to the power of N. Rather than including the POC in the bitstream to identify frames, the encoder preferably provides the LSB index in the bitstream to the decoder.
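The relation LSB=POC % (2^N) can be sketched in a few lines of Python. The function name and the default of 4 bits are assumptions for illustration. Note that Python's % operator returns a non-negative result for a negative left operand, which other languages may not do; how negative POC values are handled is not stated in the text.

```python
def poc_lsb(poc: int, num_lsb_bits: int = 4) -> int:
    """Return the POC LSB identifier, i.e. POC % (2 ** num_lsb_bits)."""
    return poc % (1 << num_lsb_bits)


# With 4 bits the identifier wraps every 16 frames.
lsbs = [poc_lsb(poc) for poc in (0, 1, 14, 15, 16, 17, 20)]
```

The list holds the identifiers for frames 0, 1, 14, 15, 16, 17 and 20 from the passage, and shows frame 16 sharing the identifier of frame 0.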

The reference frames used for inter prediction of a current frame, or frames subsequent to the current frame, may be identified with an RPS using either relative (e.g., delta) referencing (using the difference between POC values, or alternatively a deltaPOC and a currentPOC, for example) or absolute referencing (using the POC, for example). In some embodiments, frames identified with relative referencing may be called short term reference frames, and frames identified with absolute referencing may be called long term reference frames. For example, the frame identified by POC 5 310 and signaled to the decoder as LSB 5 320 in the bitstream may have an associated RPS 330 of [−5, −2, −1]. The meaning of the RPS values is described later.

Referring to FIG. 6, illustrating a portion of FIG. 5, the RPS of [−5, −2, −1] refers to frames that include the fifth previous frame 320, second previous frame 321, and first previous frame 322 relative to the current frame. This in turn refers to the POC values of 0, 3, and 4, respectively, as illustrated in FIG. 6 for the current frame with POC value of 5. Typically, the RPS refers to the difference between the POC value of the current frame and the POC value of the previous frame. For example, the RPS of [−5, −2, −1] for a current frame having a POC value of 5 refers to frames having POC values of 5 minus 5=0; 5 minus 2=3; and 5 minus 1=4. The RPS can also include frames in the future. These may be indicated with positive values in the RPS (positive deltaPOC values).
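The arithmetic for relative referencing can be sketched as follows. The function name is an assumption for illustration; each RPS value is treated as a deltaPOC added to the POC of the current frame.

```python
def resolve_relative_rps(current_poc, rps_deltas):
    """Convert deltaPOC values into the POC values they refer to."""
    return [current_poc + delta for delta in rps_deltas]


# Current frame POC 5 with RPS [-5, -2, -1], as in FIG. 6.
past_refs = resolve_relative_rps(5, [-5, -2, -1])

# A positive deltaPOC points at a frame later in output order.
future_refs = resolve_relative_rps(5, [2])
```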

In the case that the POC values are not sequential, such as in the case that one or more POC values are skipped or otherwise omitted in parts of the bitstream, the difference between the POC value of the current frame and POC value of the previous frame may be different than the number of frames output between the previous frame and current frame, such as illustrated in FIG. 7. As shown in FIG. 7, the RPS of [−5, −2, −1] refers to frames that include the fifth previous frame 320, second previous frame 321, and first previous frame 322 relative to the POC of the frame identified with POC value equal to 5. The RPS may be signaled in the bitstream in any suitable manner, such as provided together with the frame or provided together with a set of frames.

Referring to FIG. 8, another technique for signaling the reference frames is to use an absolute reference, generally referred to as a long term picture, in the RPS associated with a frame. The decoding process, such as the motion vector prediction technique, may be different depending on whether the reference frame is signaled using an absolute reference or a relative reference. The absolute reference (referred to as LT for convenience) refers to a particular LSB count value associated with a reference frame, such as a previous or subsequent frame. For example, the absolute reference of LT=3 (LT3) would refer to a reference frame having a POC LSB value of 3. Accordingly, a RPS of [LT3, −5] would refer to a reference frame having POC LSB value of 3 and a reference frame with a POC equal to the POC of the current frame minus 5. In FIG. 8, this corresponds to the reference frame with POC equal to 3 444 and the reference frame with POC equal to 0 320. Typically, the LT3 refers to the first previous frame relative to the current frame having a POC LSB value of 3. In one embodiment, LT3 refers to the first previous frame relative to the current frame in output order having a POC LSB value of 3. In a second embodiment, LT3 refers to the first previous frame relative to the current frame in transmission order having a POC LSB value of 3. While such a system is suitable for many bit streams, it is not sufficiently robust to select a frame with a LSB count value of 3 that is different than the immediately previous frame having a LSB count value of 3.
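A mixed RPS such as [LT3, −5] can be resolved with the following sketch. The tuple encoding of RPS entries, the function names, and the use of a plain collection of decoded POC values are assumptions for illustration. The sketch follows the first embodiment, in which "first previous" is taken in output order, and it takes the current POC to be 5 so that the result agrees with the frames named for FIG. 8.

```python
def first_previous_with_lsb(current_poc, lsb, decoded_pocs, num_lsb_bits=4):
    """Return the closest earlier POC (in output order) with the given POC LSB."""
    modulus = 1 << num_lsb_bits
    candidates = [p for p in decoded_pocs if p < current_poc and p % modulus == lsb]
    return max(candidates) if candidates else None


def resolve_rps(current_poc, entries, decoded_pocs, num_lsb_bits=4):
    """Resolve ("LT", lsb) and ("delta", deltaPOC) entries to POC values."""
    resolved = []
    for kind, value in entries:
        if kind == "LT":
            resolved.append(
                first_previous_with_lsb(current_poc, value, decoded_pocs, num_lsb_bits)
            )
        elif kind == "delta":
            resolved.append(current_poc + value)
        else:
            raise ValueError(f"unknown RPS entry kind: {kind}")
    return resolved


# RPS [LT3, -5] for a current frame with POC 5 and frames 0..4 already decoded.
refs = resolve_rps(5, [("LT", 3), ("delta", -5)], decoded_pocs=range(0, 5))
```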

Referring to FIG. 9, for example, if the encoder was encoding frame 31 (POC=31) and the system signals the use of the long term picture with POC LSB=0 (LT0), then this would refer to frame 16 (POC=16) since it is the first previous frame with LSB=0. However, the encoder may desire to signal the long term picture frame 0, which likewise has a POC LSB count value of 0, but this may not be accomplished with such a first previous referencing scheme. To overcome this limitation, one technique is to increase the number of least significant bits used to signal the long term frame POC LSB. While such an increase in the number of least significant bits is possible, it results in substantial additional bits being added to the bitstream.

A more preferred technique that results in fewer additional bits being added to the bitstream is to signal a different long term picture than the first immediately preceding frame with a corresponding POC LSB value. For example, the system could indicate the RPS of the current frame having an absolute reference as [LT0|2] where the 0 refers to the POC LSB value and 2 refers to which of the previous frames with POC LSB value equal to 0 to use, which in this case would be the second previous POC LSB value of 0 (e.g., frame 0 in FIG. 9). If no second reference is included then the system may default to the immediately preceding frame with a POC LSB=0 [LT0] (e.g., frame 16 in FIG. 9).
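The [LT0|2] notation can be sketched as a lookup of the n-th previous frame with a given POC LSB value. The function name, the `occurrence` parameter, and the assumption that "previous" means lower POC in output order are illustrative choices rather than details taken from the text.

```python
def resolve_long_term(current_poc, lsb, decoded_pocs, occurrence=1, num_lsb_bits=4):
    """Return the occurrence-th previous POC whose POC LSB equals lsb, or None."""
    modulus = 1 << num_lsb_bits
    matches = sorted(
        (p for p in decoded_pocs if p < current_poc and p % modulus == lsb),
        reverse=True,  # nearest previous frame first
    )
    if occurrence < 1 or occurrence > len(matches):
        return None
    return matches[occurrence - 1]


decoded = range(0, 31)  # frames 0..30 decoded before frame 31, as in FIG. 9

default_ref = resolve_long_term(31, 0, decoded)               # [LT0]
second_ref = resolve_long_term(31, 0, decoded, occurrence=2)  # [LT0|2]
```

With no second value the lookup falls back to the immediately preceding match (frame 16), while an occurrence of 2 reaches frame 0 without widening the POC LSB field.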

In many cases, the need to signal a frame that is not the first immediately preceding frame with the corresponding POC LSB value using absolute referencing will be relatively infrequent. To further reduce the overall bit rate indicating which frame to use, while permitting the capability of signaling a different frame than the first immediately preceding frame with the corresponding POC LSB value using absolute referencing, the system may use a duplication technique. For example, the RPS may be structured as follows, [LT0, LT0|3]. The duplication of the LT0 within the same RPS signals the decoder to use a different frame having a POC LSB value of 0, which in this case would be the third previous occurrence of the POC LSB value of 0. In general, aside from the potential that a particular POC LSB value would not be included in a particular cycle of POC LSB values, the desired POC LSB value will correspond to a frame of the indicated previous occurrence. Here, a cycle of POC LSB values denotes a set of frames that when ordered in output order do not contain the same POC LSB value and are not separated in output order by frames not in the set.

Referring to FIG. 10, the duplication technique may be indicated as follows. The RPS includes a signal of a long term picture having a POC LSB value 400 (e.g., [LT3]). The same RPS includes another signal of a long term picture having the same POC LSB value 410 (e.g., [LT3, LT3]). The same RPS includes another signal of the second long term picture having the same LSB count value 410 indicating the location of the desired frame 420 (e.g., [LT3, LT3|2]).
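One possible reading of the duplication technique is that a location value is carried only when a POC LSB value repeats within the same RPS, so the common case costs no extra bits. The sketch below follows that reading; the function name, the separate list of location values, and the default occurrence of 1 for a first appearance are assumptions for illustration and are not a description of the actual bitstream layout.

```python
def expand_duplicates(lsb_values, locations):
    """Pair each long term entry with an occurrence index.

    The first appearance of an LSB value defaults to occurrence 1 (the first
    previous frame); a repeated LSB value consumes the next explicit location.
    """
    seen = set()
    location_iter = iter(locations)
    expanded = []
    for lsb in lsb_values:
        if lsb in seen:
            expanded.append((lsb, next(location_iter)))
        else:
            seen.add(lsb)
            expanded.append((lsb, 1))
    return expanded


fig10_entries = expand_duplicates([3, 3], [2])  # [LT3, LT3|2]
lt0_entries = expand_duplicates([0, 0], [3])    # [LT0, LT0|3]
```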

The signaling of the location of the desired frame may be performed in any suitable manner. Referring to FIGS. 11A-11B, for example, the location may be one or more previous cycles of the POC LSB values for the desired frame relative to the current frame, such as the third previous cycle. Referring to FIG. 12, for example, the location may be based upon an absolute number of frames offset from the current frame. Referring to FIGS. 13A-13B, for example, the location may be one or more previous cycles of the POC LSB values relative to the first immediately preceding frame with the desired POC LSB value. Referring to FIG. 14, for example, the location may be based upon an absolute number of frames offset relative to the first immediately preceding frame with the desired POC LSB value.

One exemplary implementation of such a technique may use the following syntax.

slice_header( ) {                                              Descriptor
  lightweight_slice_flag                                       u(1)
  if( !lightweight_slice_flag ) {
    slice_type                                                 ue(v)
    pic_parameter_set_id                                       ue(v)
    if( IdrPicFlag ) {
      idr_pic_id                                               ue(v)
      no_output_of_prior_pics_flag                             u(1)
    }
    else {
      pic_order_cnt_lsb                                        u(v)
      short_term_ref_pic_set_pps_flag                          u(1)
      if( !short_term_ref_pic_set_pps_flag )
        short_term_ref_pic_set( num_short_term_ref_pic_sets )
      else
        short_term_ref_pic_set_idx                             u(v)
      if( long_term_ref_pics_present_flag ) {
        num_long_term_pics                                     ue(v)
        for( i = 0; i < num_long_term_pics; i++ ) {
          delta_poc_lsb_lt_minus1[ i ]                         ue(v)
          if( deltaPOCLSBCheck( i ) = = 1 ) {
            delta_poc_msb_lt_minus1[ i ]                       ue(v)
          }
          used_by_curr_pic_lt_flag[ i ]                        u(1)
        }
      }
    }
    if( slice_type = = P || slice_type = = B ) {
      num_ref_idx_active_override_flag                         u(1)
      if( num_ref_idx_active_override_flag ) {
        num_ref_idx_l0_active_minus1                           ue(v)
        if( slice_type = = B )
          num_ref_idx_l1_active_minus1                         ue(v)
      }
    }
  ...
}
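In H.264/AVC-style syntax tables, the descriptor u(n) conventionally denotes an n-bit unsigned integer and ue(v) denotes an unsigned Exp-Golomb code. As a minimal sketch of how a ue(v) element such as num_long_term_pics might be read, the following Python function decodes one code from a string of '0' and '1' characters. The string representation and the function name are assumptions for illustration, and the sketch does no error handling for truncated input.

```python
def read_ue(bits: str, pos: int = 0):
    """Decode one unsigned Exp-Golomb code; return (value, next_position)."""
    leading_zeros = 0
    while bits[pos + leading_zeros] == "0":
        leading_zeros += 1
    # Skip the zero prefix and the terminating '1' bit.
    pos += leading_zeros + 1
    suffix = bits[pos:pos + leading_zeros]
    value = (1 << leading_zeros) - 1 + (int(suffix, 2) if suffix else 0)
    return value, pos + leading_zeros


# Two codes back to back: "010" followed by "00100".
first, next_pos = read_ue("01000100")
second, end_pos = read_ue("01000100", next_pos)
```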

lightweight_slice_flag equal to 1 specifies that the value of slice header syntax elements not present shall be inferred to be equal to the value of slice header syntax elements in a preceding slice, where a preceding slice is defined as the slice containing the treeblock with location (LCUAddress−1). The lightweight_slice_flag shall be equal to 0 when LCUAddress is equal to 0. Here, a treeblock may be a macroblock and LCUAddress denotes the spatial location of the treeblock within a picture.

The slice_type specifies the coding type of the slice as follows:

slice_type    Name of slice_type
0             P (P slice)
1             B (B slice)
2             I (I slice)

When nal_unit_type is equal to 5 (IDR picture), slice_type shall be equal to 2. When max_num_ref_frames is equal to 0, slice_type shall be equal to 2.
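The slice_type mapping and the two constraints above may be checked, for illustration only, as in the following sketch. The names SLICE_TYPE_NAMES and slice_type_is_valid are hypothetical helpers and do not appear in the syntax.

```python
# Mapping taken from the slice_type table above.
SLICE_TYPE_NAMES = {0: "P", 1: "B", 2: "I"}


def slice_type_is_valid(slice_type, nal_unit_type, max_num_ref_frames):
    """Return True when slice_type satisfies the constraints stated above."""
    if slice_type not in SLICE_TYPE_NAMES:
        return False
    # An IDR picture (nal_unit_type equal to 5) shall use I slices only.
    if nal_unit_type == 5 and slice_type != 2:
        return False
    # With no reference frames available, only I slices are permitted.
    if max_num_ref_frames == 0 and slice_type != 2:
        return False
    return True
```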

pic_parameter_set_id specifies the picture parameter set in use. The value of pic_parameter_set_id shall be in the range of 0 to 255, inclusive.

idr_pic_id identifies an IDR picture, which denotes a picture that does not use previously transmitted pictures for reference. The values of idr_pic_id in all the slices of an IDR picture shall remain unchanged. When two consecutive access units in decoding order are both IDR access units, the value of idr_pic_id in the slices of the first such IDR access unit shall differ from the idr_pic_id in the second such IDR access unit. The value of idr_pic_id shall be in the range of 0 to 65535, inclusive.

no_output_of_prior_pics_flag specifies how the previously-decoded pictures in the decoded picture buffer are treated after decoding of an IDR picture. When the IDR picture is the first IDR picture in the bitstream, the value of no_output_of_prior_pics_flag has no effect on the decoding process. When the IDR picture is not the first IDR picture in the bitstream and the value of pic_width_in_luma_samples or pic_height_in_luma_samples, which denote the dimensions of the pictures, or max_dec_frame_buffering, which denotes the maximum amount of reordering required at a decoder to convert a sequence of frames in transmission order to a sequence of frames in display order, derived from the active sequence parameter set is different from the value of pic_width_in_luma_samples or pic_height_in_luma_samples or max_dec_frame_buffering derived from the sequence parameter set active for the preceding picture, no_output_of_prior_pics_flag equal to 1 may (but should not) be inferred by the decoder, regardless of the actual value of no_output_of_prior_pics_flag.

pic_order_cnt_lsb specifies the picture order count modulo MaxPicOrderCntLsb for the current picture. The length of the pic_order_cnt_lsb syntax element is log2_max_pic_order_cnt_lsb_minus4 + 4 bits. The value of the pic_order_cnt_lsb shall be in the range of 0 to MaxPicOrderCntLsb−1, inclusive. When pic_order_cnt_lsb is not present, pic_order_cnt_lsb shall be inferred to be equal to 0. Here, pic_order_cnt_lsb indicates the number of LSBs in POC LSB.
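As a minimal sketch, the relationship between a full picture order count and pic_order_cnt_lsb may be illustrated as follows. The sketch assumes that MaxPicOrderCntLsb equals 2 raised to the power (log2_max_pic_order_cnt_lsb_minus4 + 4), as in H.264/AVC. The function name poc_lsb_and_length is hypothetical.

```python
def poc_lsb_and_length(pic_order_cnt_val, log2_max_pic_order_cnt_lsb_minus4):
    """Return (pic_order_cnt_lsb, length in bits) for a full POC value."""
    length_bits = log2_max_pic_order_cnt_lsb_minus4 + 4
    # Assumption: MaxPicOrderCntLsb is a power of two set by the element length.
    max_pic_order_cnt_lsb = 1 << length_bits
    return pic_order_cnt_val % max_pic_order_cnt_lsb, length_bits
```

For example, with log2_max_pic_order_cnt_lsb_minus4 equal to 0 the element is 4 bits long, so POC values 3, 19 and 35 all share the POC LSB value 3. This is the ambiguity that the additional MSB signaling is intended to resolve.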

short_term_ref_pic_set_pps_flag equal to 1 specifies that the short-term reference picture set of the current picture shall be created using syntax elements in the active picture parameter set, which contains syntax elements that may be shared between multiple pictures. short_term_ref_pic_set_pps_flag equal to 0 specifies that the short-term reference picture set of the current picture shall be created using syntax elements in the short_term_ref_pic_set( ) syntax structure in the slice header. In some embodiments, a short-term reference picture set denotes a picture set that only uses delta referencing.

short_term_ref_pic_set_idx specifies the index to the list of the short-term reference picture sets specified in the active picture parameter set that shall be used for creation of the reference picture set of the current picture. The syntax element short_term_ref_pic_set_idx shall be represented by ceil(log2(num_short_term_ref_pic_sets)) bits. The value of short_term_ref_pic_set_idx shall be in the range of 0 to num_short_term_ref_pic_sets−1, inclusive, where num_short_term_ref_pic_sets is the syntax element from the active picture parameter set.

The variable StRpsIdx is derived as follows.

if( short_term_ref_pic_set_pps_flag )
  StRpsIdx = short_term_ref_pic_set_idx
else
  StRpsIdx = num_short_term_ref_pic_sets
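For illustration only, the derivation of StRpsIdx and the bit length of short_term_ref_pic_set_idx may be sketched as follows. The function names are hypothetical. The bit length uses an integer computation that is equivalent to ceil(log2(n)) for n of at least 1.

```python
def derive_st_rps_idx(short_term_ref_pic_set_pps_flag,
                      short_term_ref_pic_set_idx,
                      num_short_term_ref_pic_sets):
    """Mirror the StRpsIdx derivation given above."""
    if short_term_ref_pic_set_pps_flag:
        return short_term_ref_pic_set_idx
    return num_short_term_ref_pic_sets


def short_term_ref_pic_set_idx_bits(num_short_term_ref_pic_sets):
    """Integer form of ceil(log2(n)), valid for n >= 1."""
    return (num_short_term_ref_pic_sets - 1).bit_length()
```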

num_long_term_pics specifies the number of the long-term reference pictures that are to be included in the long-term reference picture set of the current picture. The value of num_long_term_pics shall be in the range of 0 to max_num_ref_frames − NumNegativePics[StRpsIdx] − NumPositivePics[StRpsIdx], inclusive. When not present, the value of num_long_term_pics shall be inferred to be equal to 0. In some embodiments, the long-term reference pictures denote reference pictures that are transmitted with absolute referencing.

delta_poc_lsb_lt_minus1[i] is used to determine the value of the least significant bits of the picture order count value of the i-th long-term reference picture that is included in the long-term reference picture set of the current picture. delta_poc_lsb_lt_minus1[i] shall be in the range of 0 to MaxPicOrderCntLsb−1, inclusive. In some embodiments, delta_poc_lsb_lt_minus1[i] denotes POC LSB of the i-th long-term reference picture.

The variable DeltaPocLt[i] is derived as follows.

if( i = = 0 )
  DeltaPocLt[ i ] = delta_poc_lsb_lt_minus1[ i ] + 1
else
  DeltaPocLt[ i ] = delta_poc_lsb_lt_minus1[ i ] + 1 + DeltaPocLt[ i − 1 ]

The value of DeltaPocLt[i] shall be in the range of 0 to MaxPicOrderCntLsb, inclusive.
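The cumulative derivation of DeltaPocLt may be illustrated, as a sketch only, as follows. The function name derive_delta_poc_lt is hypothetical. The input is the list of delta_poc_lsb_lt_minus1 values in the order signaled.

```python
def derive_delta_poc_lt(delta_poc_lsb_lt_minus1):
    """Accumulate DeltaPocLt[i] from the signaled minus1 values."""
    delta_poc_lt = []
    for i, value in enumerate(delta_poc_lsb_lt_minus1):
        # The first entry has no predecessor; later entries add the previous total.
        previous = delta_poc_lt[i - 1] if i > 0 else 0
        delta_poc_lt.append(value + 1 + previous)
    return delta_poc_lt
```

For example, signaled values [0, 2, 0] give DeltaPocLt values [1, 4, 5].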

deltaPOCLSBCheck(i) is a function as follows:

deltaPOCLSBCheck(int i) {
  for( m = 0; m < i; m++ ) {
    if( delta_poc_lsb_lt_minus1[ i ] == delta_poc_lsb_lt_minus1[ m ] ) {
      return 1;
    }
  }
  return 0;
}
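A runnable sketch of the same check is given below for illustration. The function name delta_poc_lsb_check and the explicit list argument are hypothetical conveniences. The logic follows the listing above: entry i is compared against every earlier entry.

```python
def delta_poc_lsb_check(delta_poc_lsb_lt_minus1, i):
    """Return 1 if entry i repeats the value of any earlier entry, else 0."""
    for m in range(i):
        if delta_poc_lsb_lt_minus1[i] == delta_poc_lsb_lt_minus1[m]:
            return 1
    return 0
```

With signaled values [3, 5, 3], the check returns 0 for the first two entries and 1 for the third, so only the third entry would carry delta_poc_msb_lt_minus1 in the slice header syntax.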

delta_poc_msb_lt_minus1[i] is used together with delta_poc_lsb_lt_minus1[i] to determine the value of the picture order count of the i-th long-term reference picture that is included in the long-term reference picture set of the current reference picture.

The variable delta_poc_msb_lt_minus1[i] is derived as follows:

for( n = 0; n < i; n++ ) {
  deltaNumSameLSBs = 0;
  if( delta_poc_lsb_lt_minus1[ i ] == delta_poc_lsb_lt_minus1[ n ] ) {
    if( deltaNumSameLSBs == 0 ) {
      delta_poc_msb_lt_minus1[ i ] = PicOrderCntMsb[ i ] − 1;
      deltaNumSameLSBs++;
    }
    else {
      delta_poc_msb_lt_minus1[ i ] = PicOrderCntMsb[ i ] − delta_poc_msb_lt_minus1[ n − 1 ];
    }
  }
}

In an alternative embodiment, instead of sending the element delta_poc_msb_lt_minus1 when the delta_poc_lsb_lt_minus1 values are the same, a poc_msb_lt_minus1 or poc_msb_lt element may be sent. Here, poc_msb_lt_minus1 indicates the POC value of the reference picture minus 1. This may be an absolute POC value. Similarly, poc_msb_lt indicates the POC value of the reference picture. Again, this may be an absolute POC value.

used_by_curr_pic_lt_flag[i] equal to 0 specifies that the i-th long-term reference picture included in the long-term reference picture set of the current picture is not used for reference, or inter-frame prediction, by the current picture.

num_ref_idx_active_override_flag equal to 1 specifies that the syntax element num_ref_idx_l0_active_minus1 is present for P and B slices and that the syntax element num_ref_idx_l1_active_minus1 is present for B slices. num_ref_idx_active_override_flag equal to 0 specifies that the syntax elements num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1 are not present.

When the current slice is a P or B slice and field_pic_flag is equal to 0 and the value of num_ref_idx_l0_default_active_minus1 in the picture parameter set exceeds 15, num_ref_idx_active_override_flag shall be equal to 1.

When the current slice is a B slice and field_pic_flag is equal to 0 and the value of num_ref_idx_l1_default_active_minus1 in the picture parameter set exceeds 15, num_ref_idx_active_override_flag shall be equal to 1.

num_ref_idx_l0_active_minus1 specifies the maximum reference index for reference picture list 0 that shall be used to decode the slice.

When the current slice is a P or B slice and num_ref_idx_l0_active_minus1 is not present, num_ref_idx_l0_active_minus1 shall be inferred to be equal to num_ref_idx_l0_default_active_minus1.

The range of num_ref_idx_l0_active_minus1 is specified as follows:

If field_pic_flag is equal to 0, num_ref_idx_l0_active_minus1 shall be in the range of 0 to 15, inclusive. When MbaffFrameFlag is equal to 1, num_ref_idx_l0_active_minus1 is the maximum index value for the decoding of frame macroblocks and 2*num_ref_idx_l0_active_minus1+1 is the maximum index value for the decoding of field macroblocks.

Otherwise (field_pic_flag is equal to 1), num_ref_idx_l0_active_minus1 shall be in the range of 0 to 31, inclusive.
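The range constraint on num_ref_idx_l0_active_minus1 and the field macroblock index used when MbaffFrameFlag is equal to 1 may be sketched as follows, for illustration only. Both function names are hypothetical.

```python
def num_ref_idx_l0_active_minus1_in_range(value, field_pic_flag):
    """Check the range stated above: 0..15 for frames, 0..31 for fields."""
    upper_bound = 31 if field_pic_flag else 15
    return 0 <= value <= upper_bound


def max_field_macroblock_index(num_ref_idx_l0_active_minus1):
    """Maximum index for field macroblocks when MbaffFrameFlag is equal to 1."""
    return 2 * num_ref_idx_l0_active_minus1 + 1
```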

num_ref_idx_l1_active_minus1 specifies the maximum reference index for reference picture list 1 that shall be used to decode the slice.

When the current slice is a B slice and num_ref_idx_l1_active_minus1 is not present, num_ref_idx_l1_active_minus1 shall be inferred to be equal to num_ref_idx_l1_default_active_minus1.

The range of num_ref_idx_l1_active_minus1 is constrained as specified in the semantics for num_ref_idx_l0_active_minus1 with l0 and list 0 replaced by l1 and list 1, respectively.

The operation deltaPOCLSBCheck(int i) determines whether the same POC LSB is transmitted from the encoder to the decoder using absolute referencing for the current frame. In an alternative embodiment, determining if the same POC LSB is transmitted can be accomplished by checking if the value delta_poc_lsb_lt_minus1 is equal to a value known to both the encoder and decoder. For example, delta_poc_lsb_lt_minus1 equal to 0 could denote the POC LSB is the same as the previously transmitted POC LSB. Alternatively, delta_poc_lsb_lt_minus1 equal to 2^N−1, where N denotes the number of bits used to transmit POC LSB and is known to both the encoder and decoder, could denote the POC LSB is the same as the previously transmitted POC LSB. In alternative embodiments, the value delta_poc_lsb_lt_minus1 is replaced with the syntax element delta_poc_lsb_lt, which is generally equal to delta_poc_lsb_lt_minus1 plus 1. In these embodiments, delta_poc_lsb_lt equal to a value known to both the encoder and decoder can indicate the picture transmitted using absolute referencing has the same POC LSB as the previous picture transmitted using absolute referencing in the same RPS. For example, delta_poc_lsb_lt equal to 0 could denote the POC LSB is the same as the previously transmitted POC LSB. Alternatively, delta_poc_lsb_lt equal to 2^N, where N denotes the number of bits used to transmit POC LSB and is known to both the encoder and decoder, could denote the POC LSB is the same as the previously transmitted POC LSB.
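The alternative in which a pre-agreed value of delta_poc_lsb_lt_minus1 indicates a repeated POC LSB may be sketched as follows. This is an illustration under stated assumptions only: the function name signals_same_poc_lsb and the n_bits parameter are hypothetical, and the choice between the two sentinel values is taken to be fixed in advance at both the encoder and the decoder.

```python
def signals_same_poc_lsb(delta_poc_lsb_lt_minus1, n_bits=None):
    """Return True when the signaled value equals the agreed sentinel.

    n_bits=None selects the sentinel 0; otherwise the sentinel is 2**n_bits - 1,
    where n_bits is the number of bits used to transmit the POC LSB.
    """
    sentinel = 0 if n_bits is None else (1 << n_bits) - 1
    return delta_poc_lsb_lt_minus1 == sentinel
```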

For the long term reference picture set, the decoding process may be done as follows:

for( i = 0, j = 0, k = 0; i < num_long_term_pics; i++ ) {
  PocMSB = 0;
  if( deltaPOCLSBCheck( i ) == 0 ) {
    for( n = 0; n < i; n++ ) {
      PocMSB = 0;
      deltaNumSameLSBs = 0;
      if( delta_poc_lsb_lt_minus1[ i ] == delta_poc_lsb_lt_minus1[ n ] ) {
        if( deltaNumSameLSBs == 0 ) {
          PocMSB = delta_poc_msb_lt_minus1[ i ];
          deltaNumSameLSBs++;
        }
        else {
          PocMSB += delta_poc_msb_lt_minus1[ n ];
        }
      }
    }
  }
  if( used_by_curr_pic_lt_flag[ i ] )
    PocLtCurr[ j++ ] = PocMSB + ( ( PicOrderCntVal − DeltaPocLt[ i ] + MaxPicOrderCntLsb ) % MaxPicOrderCntLsb )
  else
    PocLtFoll[ k++ ] = PocMSB + ( ( PicOrderCntVal − DeltaPocLt[ i ] + MaxPicOrderCntLsb ) % MaxPicOrderCntLsb )
}
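The final step of the process above, in which each long-term entry is placed in PocLtCurr or PocLtFoll, may be sketched as follows. This is a simplified illustration and not the complete process: the per-entry MSB contribution is assumed to have been derived already and is passed in as the hypothetical list poc_msb, and the function name split_long_term_pocs is likewise hypothetical.

```python
def split_long_term_pocs(pic_order_cnt_val, delta_poc_lt, poc_msb,
                         used_by_curr_pic_lt_flag, max_pic_order_cnt_lsb):
    """Return (PocLtCurr, PocLtFoll) for the signaled long-term entries."""
    poc_lt_curr, poc_lt_foll = [], []
    for i in range(len(delta_poc_lt)):
        # LSB part of the reference POC, wrapped into 0..MaxPicOrderCntLsb-1.
        lsb_part = ((pic_order_cnt_val - delta_poc_lt[i] + max_pic_order_cnt_lsb)
                    % max_pic_order_cnt_lsb)
        poc = poc_msb[i] + lsb_part
        if used_by_curr_pic_lt_flag[i]:
            poc_lt_curr.append(poc)   # used for reference by the current picture
        else:
            poc_lt_foll.append(poc)   # retained but not used by the current picture
    return poc_lt_curr, poc_lt_foll
```

For example, with PicOrderCntVal equal to 20, MaxPicOrderCntLsb equal to 16, DeltaPocLt values [1, 4], MSB contributions [0, 16] and flags [1, 0], the first entry yields 3 in PocLtCurr and the second yields 16 in PocLtFoll.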

The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims which follow.

1-8. (canceled)
9. A method for decoding a video bitstream comprising: decoding a current picture by using inter prediction based on a reference picture set; and storing said decoded picture to be referred for future inter prediction, wherein said reference picture set is decoded by using at least: (a) a selected number of least significant bits (LSB) of a picture order count (POC) of a reference picture; and (b) a signal to specify whether or not subsequent data to determine a MSB of the POC for said reference picture exists.
10. The method of claim 9, wherein said subsequent data is based on a difference between the MSBs of two POCs.
11. The method of claim 9, wherein said subsequent data is a value of the LSB of the POC of an i-th reference picture that is included in said reference picture set of said current picture.
12. A method for encoding a video bitstream comprising: encoding a current picture using inter prediction based on a reference picture set, wherein said reference picture set is encoded by using at least: (a) one or more reference picture identifiers each of which being based on a selected number of least significant bits (LSB) of a picture order count (POC) for a reference picture; and (b) a signal to specify whether or not subsequent data to determine a MSB of said POC exists.
13. The method of claim 12, wherein said subsequent data is based on the difference between the MSBs of two POCs.
14. The method of claim 12, wherein said subsequent data is a value of the LSB of the POC of an i-th reference picture that is included in said reference picture set of said current picture.